Available on CMS information server 



IEKP-KA/2001-23 
CMS NOTE 2001/054 




The Compact Muon Solenoid Experiment 

CMS Note 

Mailing address: CMS CERN, CH-1211 GENEVA 23, Switzerland 




15 November 2001 



Searching for Higgs Bosons in Association with 
Top Quark Pairs in the $b\bar{b}$ Decay Mode 



V. Drollinger and Th. Müller 
IEKP, Karlsruhe University, Germany 

D. Denegri 

CERN, Geneva, Switzerland and DAPNIA Saclay, France 
Abstract 



The search for the Higgs boson is one of the prime goals of the LHC. Higgs bosons lighter than 130 GeV/$c^2$ 
decay mainly to a b-quark pair. While the detection of a directly produced Higgs boson in the $b\bar{b}$ 
channel is impossible because of the huge QCD background, the channel 
$t\bar{t}H^0 \to l^\pm\nu q\bar{q}b\bar{b}b\bar{b}$ is very promising in the Standard Model and the MSSM. 

We discuss an event reconstruction and selection method based on likelihood functions. The CMS 
detector response is simulated with parametrisations obtained from detailed simulations. Various 
physics and detector performance scenarios are investigated and the results are presented. It turns out 
that excellent b-tagging performance and good mass resolution are essential for this channel. 



1 Introduction 



The Higgs mechanism [1] is the generally accepted way to generate particle masses in the electroweak theory. If the 
Higgs boson is lighter than 130 GeV/$c^2$, it decays mainly to a $b\bar{b}$ pair [2]. To observe the Higgs boson at the LHC, 
the $t\bar{t}H^0$ channel turns out to be the most promising among the Higgs production channels with $H^0 \to b\bar{b}$ 
decay [3]. In this study, we discuss the channel $t\bar{t}H^0 \to l^\pm\nu q\bar{q}b\bar{b}b\bar{b}$ (Figure 1), where the Higgs boson decays 
to $b\bar{b}$, one top quark decays hadronically and the second one leptonically. The relevant signal and background 
cross sections at the LHC ($\sqrt{s_{pp}}$ = 14 TeV) and the particle masses used in the simulation are listed in Table 1. 



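LO cross sections                                                        masses

$\sigma_{t\bar{t}H^0} \times BR_{H^0 \to b\bar{b}}$ = 1.09 - 0.32 pb     $m_{H^0}$ = 100 - 130 GeV/$c^2$
$\sigma_{t\bar{t}Z^0}$ = 0.65 pb                                         $m_{Z^0}$ = 91.187 GeV/$c^2$
$\sigma_{t\bar{t}b\bar{b}}$ = 3.28 pb                                    $m_b$ = 4.62 GeV/$c^2$
$\sigma_{t\bar{t}jj}$ = 507 pb                                           $m_t$ = 175 GeV/$c^2$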



Table 1: CompHEP cross sections for signal and background relevant for the $t\bar{t}H^0 \to l^\pm\nu q\bar{q}b\bar{b}b\bar{b}$ channel, 
calculated with the parton density function CTEQ4l. The branching ratio of the semileptonic decay mode (one $W^\pm$ 
decays to quarks, the other $W^\pm$ decays leptonically, where only decays to electrons or muons are taken into 
account) is 29% (not included in the cross sections of this table) and $m_{W^\pm}$ = 80.3427 GeV/$c^2$. 

The hard processes are generated with CompHEP and then interfaced to PYTHIA, where the fragmentation and 
hadronisation are performed. After the final state including the underlying event has been obtained, the 
CMS detector response is simulated with the parametrisations FATSIM (track reconstruction) and 
CMSJET (jet reconstruction). This yields tracks, jets, leptons (the electron or muon reconstruction efficiency is assumed 
to be 90%; taus are not considered here) and missing transverse energy. These parametrisations have been obtained 
from detailed simulations based on GEANT. 




Figure 1: One example of a $t\bar{t}H^0 \to l^\pm\nu q\bar{q}b\bar{b}b\bar{b}$ event at LO. 



2 Reconstruction 

From Figure 1 we expect to find events with one isolated lepton, missing transverse energy $E_T^{miss}$ and six jets (four 
b-jets and two non-b-jets). Initial and final state radiation are sources of additional jets, so the number of jets 
per event is typically higher than six. On the other hand, the six quarks of the hard process cannot always be 
recognised as individual jets in the detector, in which case it is impossible to reconstruct the event correctly, even 
if there are six or more jets. 

For the reconstruction of resonances it is necessary to assign the n jets of an event to the corresponding quarks 
of the hard process. In general, and ignoring information on b-jets, the number of possible combinations N is 
given in Table 2 as a function of the number of jets per event. We obtain N for the case where the masses of 
the Higgs boson, both top quarks and the hadronically decaying W boson are reconstructed. The nominal mass 
of the leptonically decaying W boson, together with $E_T^{miss}$ and the lepton four-momentum, is used to calculate two 
solutions for the longitudinal momentum of the neutrino $p_z(\nu)$, which is needed for the mass reconstruction of the 
leptonically decaying top. 
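
The two solutions for $p_z(\nu)$ follow from a quadratic equation given by the W mass constraint. The following is a minimal sketch, assuming a massless lepton and neutrino and taking the neutrino transverse momentum from the missing transverse energy. The function name, the argument layout and the handling of a negative discriminant (no solution is returned) are illustrative choices and are not taken from the analysis.

```python
import math

M_W = 80.3427  # nominal W mass in GeV/c^2, as quoted in the Table 1 caption


def neutrino_pz_solutions(lep, met_x, met_y, m_w=M_W):
    """Return the real solutions for the neutrino p_z in GeV/c.

    lep is (px, py, pz) of the charged lepton, treated as massless.
    An empty list is returned when the W mass constraint has no real
    solution (an assumption of this sketch, not the paper's prescription).
    """
    lpx, lpy, lpz = lep
    pt_l2 = lpx * lpx + lpy * lpy
    e_l = math.sqrt(pt_l2 + lpz * lpz)
    pt_nu2 = met_x * met_x + met_y * met_y
    # m_W^2 = 2 (E_l E_nu - p_l . p_nu)  =>  E_l E_nu = a + lpz * pz_nu
    a = 0.5 * m_w * m_w + lpx * met_x + lpy * met_y
    disc = a * a - pt_l2 * pt_nu2
    if disc < 0.0 or pt_l2 == 0.0:
        return []
    root = e_l * math.sqrt(disc)
    return [(a * lpz - root) / pt_l2, (a * lpz + root) / pt_l2]
```

Each returned value, combined with the missing transverse energy, gives a neutrino four-momentum for which the lepton-neutrino invariant mass equals the nominal W mass.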

Good mass resolution and the identification of b-jets are essential to reduce the number of wrong combinations in the 
event reconstruction. A good mass resolution can be obtained when the energy and direction of each reconstructed 
jet agree as closely as possible with the quantities of the corresponding parent quark. This can be achieved with 
jet corrections as described in [10] and [11]. For b-tagging we use b-probabilities (2) and (3) (see appendix) which 
depend on impact parameters of tracks and leptons inside the jets. They are determined using $t\bar{t}$ six jet events, as 
described in [10]. The identification of b-jets is even more important for efficient background suppression. 
Table 2: Number of jets per event n and the corresponding number of possible combinations N. If there are more 
than a dozen jets, only the twelve with highest $E_T$ are considered. 
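
To illustrate how quickly the number of combinations grows with the jet multiplicity, here is a small counting sketch. The counting convention is an assumption made for illustration: an unordered jet pair for the Higgs boson, an unordered pair for the hadronic W, an ordered choice of the two b-jets from the top decays, and a factor of two for the two $p_z(\nu)$ solutions. It need not coincide with the convention behind Table 2.

```python
from math import comb

MAX_JETS = 12  # only the twelve highest-E_T jets are considered (Table 2 caption)


def n_combinations(n_jets, pz_solutions=2):
    """Count jet-to-quark assignments under the convention stated above."""
    n = min(n_jets, MAX_JETS)
    if n < 6:
        return 0
    higgs_pairs = comb(n, 2)       # unordered pair assigned to H0 -> b bbar
    w_pairs = comb(n - 2, 2)       # unordered pair assigned to the hadronic W
    top_b = (n - 4) * (n - 5)      # ordered: b of leptonic top, b of hadronic top
    return higgs_pairs * w_pairs * top_b * pz_solutions


for n in range(6, 13):
    print(n, n_combinations(n))
```

Under this convention six jets already give 360 combinations, which shows why b-tagging and mass information are needed to pick the right one.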

Figure 2 shows the invariant mass distributions of the reconstructed resonances of $t\bar{t}H^0 \to l^\pm\nu q\bar{q}b\bar{b}b\bar{b}$ events in 
the case of an ideal reconstruction: after the "preselection" and the calculation of $p_z(\nu)$ (see below) each quark 
of the hard process is matched with exactly one jet, the closest one in $\Delta R = \sqrt{\Delta\eta^2 + \Delta\phi^2}$, if $\Delta R(q,j) < 0.3$ and if the 
jet energy is within ±30% of the parent quark energy. The mean values and widths of the top and W mass 
distributions are used to define likelihood functions used in the selection procedure described in the following. 
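
The matching criteria of the ideal reconstruction can be sketched as follows. This is an illustration only: the dictionary keys (`eta`, `phi`, `E`) and function names are hypothetical, and the sketch matches one quark at a time without enforcing that each jet is used only once.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the eta-phi plane, with the phi difference wrapped to [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)


def match_quark(quark, jets, max_dr=0.3, max_rel_de=0.30):
    """Return the index of the jet matched to the quark, or None.

    The closest jet in Delta R is accepted if Delta R < max_dr and if its
    energy lies within max_rel_de (30%) of the quark energy.
    """
    if not jets:
        return None
    distances = [delta_r(quark["eta"], quark["phi"], j["eta"], j["phi"]) for j in jets]
    i_best = min(range(len(jets)), key=distances.__getitem__)
    if distances[i_best] >= max_dr:
        return None
    if abs(jets[i_best]["E"] - quark["E"]) > max_rel_de * quark["E"]:
        return None
    return i_best
```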
Figure 2: Invariant resonance masses of the $t\bar{t}H^0 \to l^\pm\nu q\bar{q}b\bar{b}b\bar{b}$ signal: Higgs boson, leptonic top, hadronic top 
and hadronic $W^\pm$. The leptonic $W^\pm$ is not reconstructed but its nominal mass is used to calculate $p_z(\nu)$. The 
generated masses are: $m_{H^0}$ = 115 GeV/$c^2$, $m_t$ = 175 GeV/$c^2$ and $m_{W^\pm}$ = 80.3427 GeV/$c^2$. 
o Preselection 

Events are selected if there is an isolated lepton ($e^\pm$ or $\mu^\pm$ with $p_T$ > 10 GeV/c within the tracker acceptance; no 
other track with $p_T$ > 1 GeV/c in a cone of 0.2 around the lepton) and at least six jets ($E_T$ > 20 GeV, $|\eta|$ < 2.5). 
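
The preselection can be written compactly as in the sketch below. The data layout (dictionaries with `pt`, `et`, `eta`, `phi`) is hypothetical, the `tracks` list is assumed not to contain the lepton's own track, and the numerical tracker acceptance `tracker_eta_max=2.5` is an assumption, since the text above does not quote a value.

```python
import math


def delta_r(a, b):
    """Distance in the eta-phi plane between two objects."""
    dphi = math.remainder(a["phi"] - b["phi"], 2.0 * math.pi)
    return math.hypot(a["eta"] - b["eta"], dphi)


def passes_preselection(leptons, tracks, jets, tracker_eta_max=2.5):
    """At least one isolated e/mu and at least six jets within acceptance."""

    def isolated(lep):
        # no other track with pT > 1 GeV/c in a cone of 0.2 around the lepton
        return not any(t["pt"] > 1.0 and delta_r(lep, t) < 0.2 for t in tracks)

    good_leptons = [
        lep for lep in leptons
        if lep["pt"] > 10.0 and abs(lep["eta"]) < tracker_eta_max and isolated(lep)
    ]
    good_jets = [j for j in jets if j["et"] > 20.0 and abs(j["eta"]) < 2.5]
    return len(good_leptons) >= 1 and len(good_jets) >= 6
```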

o Event Configuration 

In order to be able to reconstruct the Higgs mass, we have to find the correct event configuration among all possible 
combinations listed in Table 2. The best configuration is defined as the one which gives the highest value of an event 
likelihood function (1), which takes into account b-tagging of four jets, anti-b-tagging of the two jets supposed to 
come from the hadronic $W^\pm$, mass reconstruction of $W^\pm$ and the two top quarks, and sorting of the b-jet energies. 

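$$\mathrm{L\_EVNT} = \prod_{i=1,4} P_b(b_i) \times \prod_{i=1,2} [1 - P_b(q_i)] \times \prod_{i=W^\pm,t,\bar{t}} e^{-0.5 \times \left(\frac{m_i - \bar{m}_i}{\sigma_i}\right)^2} \times f[E_b(t,\bar{t}) - E_b(H^0)] \qquad (1)$$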

The detailed version of this event likelihood function can be found in the appendix. 
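
The search for the best configuration amounts to a loop over all jet assignments, keeping the one with the highest likelihood. The sketch below is a simplified illustration, not the analysis code: it keeps only the b-tagging factors and the mass terms of the hadronic W and the hadronic top, and omits the leptonic top and the energy-sorting term. The hadronic-top mean and width (171.4 and 10.8 GeV/$c^2$) are the values legible in Eq. (4); the W values are hypothetical placeholders, as are the dictionary keys (`E`, `px`, `py`, `pz`, `pb`).

```python
import itertools
import math

M_TOP_HAD = (171.4, 10.8)  # (mean, width) in GeV/c^2, legible in Eq. (4)
M_W_HAD = (80.3, 8.0)      # hypothetical placeholder values, for illustration


def inv_mass(jets):
    """Invariant mass of the summed four-momenta."""
    e, px, py, pz = (sum(j[k] for j in jets) for k in ("E", "px", "py", "pz"))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def mass_term(mass, mean_width):
    mean, width = mean_width
    return math.exp(-0.5 * ((mass - mean) / width) ** 2)


def best_configuration(jets):
    """Return (likelihood, roles) of the most likely jet assignment."""
    best = (-1.0, None)
    indices = range(len(jets))
    for w in itertools.combinations(indices, 2):          # hadronic W jets
        others = [i for i in indices if i not in w]
        m_w = inv_mass([jets[i] for i in w])
        anti_b = math.prod(1.0 - jets[i]["pb"] for i in w)
        for b_th, b_tl in itertools.permutations(others, 2):  # top b-jets
            m_t = inv_mass([jets[b_th], jets[w[0]], jets[w[1]]])
            rest = [i for i in others if i not in (b_th, b_tl)]
            for h in itertools.combinations(rest, 2):         # Higgs b-jets
                b_tag = math.prod(jets[i]["pb"] for i in (b_th, b_tl) + h)
                lik = (b_tag * anti_b
                       * mass_term(m_w, M_W_HAD) * mass_term(m_t, M_TOP_HAD))
                if lik > best[0]:
                    roles = {"W": w, "b_top_had": b_th, "b_top_lep": b_tl, "H": h}
                    best = (lik, roles)
    return best
```

Because the omitted terms are what distinguish the Higgs b-jets from the b-jet of the leptonic top, this reduced version only resolves the hadronic W and the hadronic top unambiguously.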
o Jet Combinations 

Events with more than six jets can contain gluon jets from final state radiation, which have not been used in the 
analysis up to this point. The combination of these jets with the correct quark jets can improve the event reconstruction further. 
The additional jets are combined with the decay products of both top quarks if they are closer than $\Delta R(j,j) < 1.7$ 
and if the resulting mass is closer to the expected value of Figure 2. If there are still jets left, they are considered 
as Higgs decay products and are combined with the closest of the corresponding two b-jets, if $\Delta R(j,j) < 0.4$. 

o Event Selection 

Three likelihood functions, for resonances (5) (L_RESO > 0.05), b-tagging (6) (L_BTAG > 0.50) and kinematics (7) 
(L_KINE > 0.2), are used to reduce the fraction of background events. Finally, the events are counted in 
a mass window around the expected Higgs mass peak ($m_{inv}(j,j)$ in $m \pm 1.9\sigma$; $m$ and $\sigma$ are obtained from mass 
distributions as shown in Figure 2 with various generated Higgs masses). The likelihood cuts have been optimised 
assuming a Higgs mass of 120 GeV/$c^2$. 

The overall efficiency for a triggered event to be finally selected is 1.3% for $t\bar{t}H^0$ ($m_{H^0}$ = 115 GeV/$c^2$), 0.2% 
for $t\bar{t}Z^0$, 0.4% for $t\bar{t}b\bar{b}$ and 0.003% for $t\bar{t}jj$ events. This shows that the reducible background is suppressed very 
effectively. In addition, little combinatorial background is left with this reconstruction method (an example is 
shown in Figure 3). 




[Figure 3 plot; horizontal axis: $m_{inv}(j,j)$ [GeV/$c^2$]] 

Figure 3: Simulated invariant mass distribution of signal (dark shaded, $m_{H^0}$ = 115 GeV/$c^2$) plus background 
for $L_{int}$ = 30 fb$^{-1}$. The dashed curve is obtained from the fit of the background without signal, the solid line 
describes the fit of signal plus background. 
3 SM Results 



After the whole reconstruction and event selection procedure, it turns out that the irreducible background (with 
four real b-jets) is dominant. Even the $t\bar{t}jj$ background, where only the two b-jets from the top decays are generated in 
the hard process, is dominated by events with four real b-jets. This is possible after the fragmentation in PYTHIA: 
e.g. $gg \to t\bar{t}gg \to l^\pm\nu q\bar{q}b\bar{b}gb\bar{b}$ with one $b\bar{b}$ pair coming from $g \to b\bar{b}$ (gluon splitting). In this case the final 
state consists of nine partons or leptons, which is one more than expected at LO, and is therefore considered as a HO 
(in this case NLO) process. Together with the number of $t\bar{t}b\bar{b}$ events (considered as LO) we obtain an intrinsic 
k-factor $k_{t\bar{t}qq}$ = 1.9 for all $t\bar{t}qq$ events. For the $t\bar{t}H^0$ signal and the $t\bar{t}Z^0$ background we assume two scenarios for 
$k_{t\bar{t}H^0} = k_{t\bar{t}Z^0} = k$: LO (no k-factor), k = 1.0, and a more optimistic case, k = 1.5. In the meantime an NLO calculation 
of the $t\bar{t}H^0$ cross section has been performed [12], where $k \approx 1.2$ at a central scale $\mu = (2m_t + m_{H^0})/2$. For 
these two k-factor scenarios, the signal to background ratio $S/B$, the significance $S/\sqrt{B}$ for $L_{int}$ = 30 fb$^{-1}$, the 
integrated luminosity $L_{int}$ required for a significance of five or more and the precision on $y_t$ for $L_{int}$ = 30 fb$^{-1}$ 
are shown in Figure 4 as a function of the generated Higgs mass. 
Figure 4: $S/B$, $S/\sqrt{B}$, $L_{int}$ (required for $S/\sqrt{B}$ = 5) and $\Delta y_t/y_t$ versus generated Higgs mass in the SM. Two 
k-factor scenarios ($k_{t\bar{t}H^0} = k_{t\bar{t}Z^0} = k$, $k_{t\bar{t}qq}$ = 1.9) are shown: k = 1.0 (dots) and k = 1.5 (boxes). 

$S/B$ is of the order of 50% or higher, the significance is relatively high already for $L_{int}$ = 30 fb$^{-1}$, and a 
significance above five is reached with a low integrated luminosity. An integrated luminosity $L_{int}$ = 100 fb$^{-1}$ would be 
enough to explore all points considered in Figure 4. Apart from these results, the Higgs mass can be determined 
from the Gaussian fit of the final mass distribution (see Figure 3) with a precision of better than 4%. Finally, the 
total event rate determines the top-Higgs Yukawa coupling $y_t$ with a precision of around 15%, if we assume a 
known branching fraction of the decay $H^0 \to b\bar{b}$. 
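
The scaling behind these statements can be made explicit with a short sketch. It assumes that S and B both grow linearly with the integrated luminosity, that the signal rate is proportional to $y_t^2$, and that only the statistical uncertainty $\sqrt{S+B}$ on the signal count is considered; the function names are illustrative.

```python
import math


def significance(s, b):
    return s / math.sqrt(b)


def luminosity_for_significance(s, b, lumi, target=5.0):
    """S and B scale with L, so S/sqrt(B) scales with sqrt(L)."""
    return lumi * (target / significance(s, b)) ** 2


def yukawa_stat_precision(s, b):
    """Rate ~ y_t^2, hence dy/y = 0.5 * dS/S with dS ~ sqrt(S + B)."""
    return 0.5 * math.sqrt(s + b) / s


# S = 38 and B = 52 are the Table 3 values for L_int = 30 fb^-1,
# k = 1.5 and m_H0 = 115 GeV/c^2.
print(significance(38, 52))
print(luminosity_for_significance(38, 52, 30.0))
print(yukawa_stat_precision(38, 52))
```

For these inputs the significance is about 5.3, a significance of five is reached at about 27 fb$^{-1}$, and the purely statistical precision on $y_t$ is about 12%, of the same order as the 15% quoted above.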
4 MSSM Results 

To give an idea of the discovery potential of the corresponding channel $t\bar{t}h^0 \to l^\pm\nu q\bar{q}b\bar{b}b\bar{b}$ in the MSSM, 
we extrapolate the SM results (by rescaling the production cross section times branching ratio, obtained with 
SPYTHIA [13]) and discuss the parameter space coverage of one benchmark scenario called the "maximum $m_h$" 
scenario [14], which turns out to be the most difficult scenario. The reason is the rapidly falling cross section and 
branching ratio with increasing Higgs mass, which limits the discovery potential of this channel in the SM as well. 
Figure 5: Discovery contours in the MSSM ("maximum $m_h$" scenario) parameter space for $L_{int}$ = 30 fb$^{-1}$ (left) 
and for $L_{int}$ = 100 fb$^{-1}$ (right). $S/\sqrt{B} > 5$ on the shaded side of the solid line. The dotted and dashed lines are 
the isomass curves for $m_{h^0}$ = 125 GeV/$c^2$ and $m_{h^0}$ = 115 GeV/$c^2$, respectively. 



Figure 5 shows the parameter space coverage in the $m_A$-$\tan\beta$ plane for two integrated luminosities. In both cases 
there is an inaccessible region at low $m_A$, whereas the second difficult region at high $m_A$ and $\tan\beta$ disappears 
with increased integrated luminosity. In other scenarios the difficult regions are smaller, which means that for 
sufficient integrated luminosity most of the MSSM parameter space can be covered with this channel. 



5 Some CMS Performance Considerations 

We have obtained the previous results by considering jets with $|\eta|$ < 2.5 and b-tagging using both impact parameter 
measurements and the additional information on leptons ($e^\pm$ or $\mu^\pm$) inside the jets. For this particular scenario 
the result is shown again in Table 3 (second line). In the same table we compare the situation when some of this 
information is not available. The first line is the result for the case when the information on leptons inside jets 
is missing. $S/B$ is even somewhat higher, which means a higher purity, but the efficiency and the resulting 
significance are lower, though not dramatically. 



b-tagging scenario            jet acceptance       S     B     S/B    $S/\sqrt{B}$

without lepton information    $|\eta|$ < 2.5       26    31    84%    4.7
with lepton information       $|\eta|$ < 2.5       38    52    73%    5.3
with lepton information       $|\eta|$ < 2.0       30    41    75%    4.8
with lepton information       $|\eta|$ < 1.5       20    27    73%    3.8



Table 3: Signal and background dependence on b-tagging scenario and jet acceptance, respectively. The numbers 
are given for $L_{int}$ = 30 fb$^{-1}$, k = 1.5 and $m_{H^0}$ = 115 GeV/$c^2$ in the SM. 






In case of a reduced jet $\eta$ acceptance or tracker acceptance, respectively, signal and background are reduced in 
the same way. This gives a practically constant $S/B$ and a decreasing $S/\sqrt{B}$ for smaller acceptances. The result 
of the third line ($|\eta|$ < 2.0) is still good, but for an acceptance of $|\eta|$ < 1.5 (last line) the result is significantly 
worse. Because the signal to background ratio is stable, these effects can be compensated with higher integrated 
luminosity. 
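
As a cross-check of this statement, the derived columns of Table 3 can be recomputed from the rounded S and B values. The sketch below is illustrative; small differences with respect to the quoted values are expected because S and B are given as rounded integers. The last column assumes that $S/\sqrt{B}$ grows with the square root of the integrated luminosity.

```python
import math

# (scenario, |eta| max, S, B, quoted S/B, quoted S/sqrt(B)), copied from Table 3
TABLE_3 = [
    ("without lepton information", 2.5, 26, 31, 0.84, 4.7),
    ("with lepton information", 2.5, 38, 52, 0.73, 5.3),
    ("with lepton information", 2.0, 30, 41, 0.75, 4.8),
    ("with lepton information", 1.5, 20, 27, 0.73, 3.8),
]


def derived(s, b):
    """Return (S/B, S/sqrt(B)) for given event counts."""
    return s / b, s / math.sqrt(b)


def luminosity_factor(sig_reference, sig):
    """Factor by which L_int must grow to recover the reference significance."""
    return (sig_reference / sig) ** 2


reference = derived(38, 52)[1]  # second line: nominal scenario
for name, eta_max, s, b, quoted_sb, quoted_sig in TABLE_3:
    sb, sig = derived(s, b)
    print(f"{name:27s} |eta| < {eta_max}: S/B = {sb:.2f} (quoted {quoted_sb:.2f}), "
          f"S/sqrt(B) = {sig:.1f} (quoted {quoted_sig}), "
          f"luminosity factor = {luminosity_factor(reference, sig):.2f}")
```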



6 Conclusions 

After a detailed study [10] we conclude that it is possible to reconstruct the $t\bar{t}H^0 \to l^\pm\nu q\bar{q}b\bar{b}b\bar{b}$ signal without 
significant combinatorial background, although effects of event pile-up have still to be evaluated. There are two basic 
requirements: good jet reconstruction, which guarantees a good mass resolution, and excellent b-tagging 
performance, which allows efficient and clean identification of b-jets. This helps to reduce the background substantially. 

In the SM, a discovery is possible already after a short period of data taking at the LHC. The same is true in the 
MSSM, where most of the parameter space can be covered with low integrated luminosity. 

Besides the discovery of the Higgs boson, measurements of the Higgs mass and of the top-Higgs Yukawa coupling 
are possible with considerable precision, which is important to understand the nature of the Higgs boson. 

It is encouraging to see that in less favourable b-tagging conditions and with reduced acceptance the reconstruction 
of this channel does not break down altogether and the same results can be obtained by just increasing the integrated 
luminosity. 

A b-Quark Distributions 

Transverse energy and pseudorapidity distributions for b-jets in $t\bar{t}H^0$ final states: Figure 6. 
Figure 6: b-quark $E_T$ (left) and $|\eta|$ (right) distributions obtained from the $t\bar{t}H^0 \to l^\pm\nu q\bar{q}b\bar{b}b\bar{b}$ signal without cuts: 
b-quarks from Higgs decay with $m_{H^0}$ = 115 GeV/$c^2$ are shown in the upper plots and b-quarks from top decays 
with $m_t$ = 175 GeV/$c^2$ are shown in the corresponding lower plots. 
B b-Probability Functions 

The b-probability functions are used to define the likelihood functions (4) and (6). If there is a lepton reconstructed 
inside the jet, the b-probability is calculated from (3), otherwise the function for jets without leptons (2) is used. 



$$\mathrm{B\_PROB} = \arctan[2.249\,\sigma(ip) - 3.197 - 0.007709\,E_T - \exp(0.7053 - 0.06249\,E_T)] \times 0.2921 + 0.4877 \qquad (2)$$

$$\mathrm{L\_PROB} = \arctan[1.510\,\sigma(ip) - 1.394 - 0.008196\,E_T - \exp(0.7624 - 0.08526\,E_T)] \times 0.1026 + 0.8363 \qquad (3)$$
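
Equations (2) and (3) translate directly into code. The sketch below uses the coefficients as printed above; the function names and the `has_lepton` switch are illustrative, and $E_T$ is assumed to be given in GeV.

```python
import math


def b_prob(sigma_ip, e_t):
    """Eq. (2): b-probability for jets without a reconstructed lepton."""
    x = 2.249 * sigma_ip - 3.197 - 0.007709 * e_t - math.exp(0.7053 - 0.06249 * e_t)
    return math.atan(x) * 0.2921 + 0.4877


def l_prob(sigma_ip, e_t):
    """Eq. (3): b-probability for jets with a reconstructed lepton inside."""
    x = 1.510 * sigma_ip - 1.394 - 0.008196 * e_t - math.exp(0.7624 - 0.08526 * e_t)
    return math.atan(x) * 0.1026 + 0.8363


def b_probability(sigma_ip, e_t, has_lepton):
    """Select Eq. (3) if a lepton is found inside the jet, Eq. (2) otherwise."""
    return l_prob(sigma_ip, e_t) if has_lepton else b_prob(sigma_ip, e_t)
```

Since the arctangent is bounded by $\pm\pi/2$, both functions stay strictly between 0 and 1, and both increase with the impact parameter significance $\sigma(ip)$ at fixed $E_T$.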



C Likelihood Functions 

The likelihood functions used for the physics analysis of the $t\bar{t}H^0 \to l^\pm\nu q\bar{q}b\bar{b}b\bar{b}$ channel are defined in the 
following expressions. (4) is used to find the correct event configuration. The boldface variables represent the jets 
of an event. All possible combinations are checked: for instance, the jet with highest $E_T$ is treated as BTH (b-jet 
of the hadronic top decay), then as B1H (b-jet "one" of the Higgs decay), then as J1W, and so on. All 
other likelihood functions are defined for one (the final) event configuration. 



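$$
\begin{aligned}
\mathrm{L\_EVNT} = {} & \text{b-probability}[\sigma_{ip}(\mathbf{B1H}), E_T(\mathbf{B1H})] \\
& \times \text{b-probability}[\sigma_{ip}(\mathbf{B2H}), E_T(\mathbf{B2H})] \\
& \times \text{b-probability}[\sigma_{ip}(\mathbf{BTL}), E_T(\mathbf{BTL})] \\
& \times \text{b-probability}[\sigma_{ip}(\mathbf{BTH}), E_T(\mathbf{BTH})] \\
& \times (1 - \text{b-probability}[|\sigma_{ip}(\mathbf{J1W})|, E_T(\mathbf{J1W})]) \\
& \times (1 - \text{b-probability}[|\sigma_{ip}(\mathbf{J2W})|, E_T(\mathbf{J2W})]) \\
& \times \exp\left[-0.5 \times \left\{\frac{m_{inv}(\mathbf{J1W},\mathbf{J2W}) - \bar{m}_{W}}{\sigma_{W}}\right\}^2\right] \qquad (4) \\
& \times \exp\left[-0.5 \times \left\{\frac{m_{inv}(\mathbf{BTL},l,\nu) - \bar{m}_{t_l}}{\sigma_{t_l}}\right\}^2\right] \\
& \times \exp\left[-0.5 \times \left\{\frac{m_{inv}(\mathbf{BTH},\mathbf{J1W},\mathbf{J2W}) - 171.4}{10.8}\right\}^2\right] \\
& \times f\left\{\frac{4\,[E(\mathbf{BTL}) + E(\mathbf{BTH}) - E(\mathbf{B1H}) - E(\mathbf{B2H})]}{E(\mathbf{BTL}) + E(\mathbf{BTH}) + E(\mathbf{B1H}) + E(\mathbf{B2H})}\right\}
\end{aligned}
$$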



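$$
\begin{aligned}
\mathrm{L\_RESO} = {} & \exp\left[-0.5 \times \left\{\frac{m_{inv}(\mathbf{BTL},l,\nu) - \bar{m}_{t_l}}{\sigma_{t_l}}\right\}^2\right] \\
& \times \exp\left[-0.5 \times \left\{\frac{m_{inv}(\mathbf{BTH},\mathbf{J1W},\mathbf{J2W}) - \bar{m}_{t_h}}{\sigma_{t_h}}\right\}^2\right] \qquad (5) \\
& \times \exp\left[-0.5 \times \left\{\frac{m_{inv}(\mathbf{J1W},\mathbf{J2W}) - \bar{m}_{W}}{\sigma_{W}}\right\}^2\right]
\end{aligned}
$$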



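$$
\begin{aligned}
\mathrm{L\_BTAG} = {} & \text{b-probability}[\sigma_{ip}(\mathbf{B1H}), E_T(\mathbf{B1H})] \\
& \times \text{b-probability}[\sigma_{ip}(\mathbf{B2H}), E_T(\mathbf{B2H})] \qquad (6) \\
& \times \text{b-probability}[\sigma_{ip}(\mathbf{BTL}), E_T(\mathbf{BTL})] \\
& \times \text{b-probability}[\sigma_{ip}(\mathbf{BTH}), E_T(\mathbf{BTH})]
\end{aligned}
$$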



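L_KINE                                                                                   (7)

    $E_T(\mathbf{B1H}, \mathbf{B2H}, \mathbf{BTL}, l, \nu, \mathbf{BTH}, \mathbf{J1W}, \mathbf{J2W})$
    $E_T(\mathbf{B1H}, \mathbf{B2H})$
    $\sum_{i=1,2} E_T(\mathbf{B}i\mathbf{H})$
    $E_T(\mathbf{BTL}, l, \nu) + E_T(\mathbf{BTH}, \mathbf{J1W}, \mathbf{J2W})$
    $E_T^{tot}$ (ECAL+HCAL+VFCAL)
    $E_T(\mathbf{BTL}, l, \nu)$, $E_T(\mathbf{BTH}, \mathbf{J1W}, \mathbf{J2W})$
    $E(\mathbf{B1H}, \mathbf{B2H})$, $E(\mathbf{BTL}, l, \nu)$, $E(\mathbf{BTH}, \mathbf{J1W}, \mathbf{J2W})$
    $E_T(\mathbf{B1H}, \mathbf{B2H})$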